Future crewed missions to Mars present a maintenance logistics challenge that is 
unprecedented in human spaceflight. Mission endurance — defined as the time between 
resupply opportunities — will be significantly longer than previous missions, and therefore 
logistics planning horizons are longer and the impact of uncertainty is magnified. 
Maintenance logistics forecasting typically assumes that component failure rates are 
deterministically known and uses them to represent aleatory uncertainty, or uncertainty that 
is inherent to the process being examined. However, failure rates cannot be directly measured; 
rather, they are estimated based on similarity to other components or statistical analysis of 
observed failures. As a result, epistemic uncertainty — that is, uncertainty in knowledge of the 
process — exists in failure rate estimates that must be accounted for. Analyses that neglect 
epistemic uncertainty tend to significantly underestimate risk. Epistemic uncertainty can be 
reduced via operational experience; for example, the International Space Station (ISS) failure 
rate estimates are refined using a Bayesian update process. However, design changes may reintroduce 
epistemic uncertainty. Thus, there is a tradeoff between changing a design to reduce 
failure rates and operating a fixed design to reduce uncertainty. This paper examines the 
impact of epistemic uncertainty on maintenance logistics requirements for future Mars 
missions, using data from the ISS Environmental Control and Life Support System (ECLS) 
as a baseline for a case study. Sensitivity analyses are performed to investigate the impact of 
variations in failure rate estimates and epistemic uncertainty on spares mass. The results of 
these analyses and their implications for future system design and mission planning are 
discussed. 
σ = lognormal distribution scale parameter
T = mission endurance
C = confidence
C_req = required system-level confidence
m_i = mass of item i
n_i = number of spares provided for item i
POS_i = Probability of Sufficiency for item i
POS_req = required system-level Probability of Sufficiency
CCAA Common Cabin Air Assembly 

CDF Cumulative Distribution Function 
CDRA Carbon Dioxide Removal Assembly 
DSH Deep Space Habitat 

ECLS Environmental Control and Life Support 
EF Error Factor 

EMAT Exploration Maintainability Analysis Tool 
ISM In-Space Manufacturing 

ISS International Space Station 

LEO Low Earth Orbit 

LoC Loss of Crew 

MADS Maintenance and Analysis Data Set 
MTBF Mean Time Between Failures 

OGA Oxygen Generation Assembly 

P(LoC) Probability of Loss of Crew 

PDF Probability Density Function 

POS Probability of Sufficiency 

QPA Quantity Per Application 

TCCS Trace Contaminant Control System 
UPA Urine Processor Assembly 

WPA Water Processor Assembly 


I. Introduction 


Future missions beyond Low Earth Orbit (LEO) present an unprecedented challenge for human spaceflight from 
the perspective of supportability — that is, the set of system characteristics that strongly influence the logistics and 
support required to enable safe and effective operation of systems.!? Support requirements include deterministic 
logistics requirements such as consumables and life-limited items, as well as stochastic requirements for spare parts 
and maintenance supplies in response to random failures. In addition, the amount of crew time required to execute 
maintenance activities must be considered. All current and previous human spaceflight missions have either been of 
short duration (e.g. Space Shuttle, Apollo), or have been long-duration missions resupplied from Earth at 
regular, relatively short intervals (e.g. the International Space Station (ISS)). In either case, the overall mission 
endurance, defined as the time between resupply opportunities (including launch and landing),’ has been short, ranging 
from days to months. In contrast, a mission to Mars will require systems that can sustain the crew for an endurance of 
1,100 days, or approximately 3 years.’ In addition, all previous missions have had the option to abort and return to 
Earth in a matter of days in the event of an emergency. This option will not be available on a mission to Mars. The 
combination of very long mission endurance and lack of abort options creates a significant challenge for human 
spaceflight supportability and logistics management. Previous supportability strategies, which were often optimized 
for Earth-dependent operation in LEO, will no longer be effective for missions beyond LEO; new strategies must be 
developed to support the spaceflight systems of the future.?>~° 

Supportability is fundamentally a trade between risk and resources, due to the influence of stochastic maintenance 
demands on the need for supportability resources such as spare parts. Specifically, there is a risk that the spare parts 
provided will not be sufficient to cover all maintenance demands that occur during the mission. Spares are provided 
to reduce this risk to acceptable levels. The risk of insufficient spares is driven by both aleatory and epistemic 
uncertainty. Aleatory uncertainty results from natural randomness in the process being studied, while epistemic 
uncertainty results from a lack of knowledge about the process.!° While aleatory uncertainty is typically accounted for 
in supportability analyses by modeling the amount of time that a component will function before failure as a random 
variable (typically following an exponential distribution'') with a known failure rate, epistemic uncertainty is also 
present in the value of those failure rates. A component’s failure rate cannot be directly measured, but must instead 
be estimated based upon statistical analysis of failure histories and/or comparison to similar components. Testing and 
operational experience provide additional data upon which to base these estimates. For example, aircraft or 
automobile designs typically have a large amount of operational experience gathered from long-term operation of 
many instances of the same unit, and this extensive experience can be very valuable for uncertainty reduction. 
However, when experience is limited or the population of test articles is low — as is typically the case for crewed 
spacecraft — a significant amount of epistemic uncertainty may remain in failure rate estimates. This remaining 
uncertainty can, in turn, have a very significant impact on supportability assessments and logistics forecasting for deep 
space missions.”!? 

Variation in component reliabilities has an asymmetric impact on system-level supportability; put another way, 
the consequences of increased failure rates are greater than the benefits of reduced failure rates for a given item. 
The system-level probability of having sufficient spares is the product of the corresponding probabilities for its 
constituent elements, and is therefore limited by the 
highest-risk element. If a single component has lower-than-expected reliability, the system-level risk at a given sparing 
level may be significantly higher, even if other components are more reliable than expected. As a result, epistemic 
uncertainty in failure rates tends to have a negative overall impact on system characteristics. When epistemic 
uncertainty is neglected, supportability analyses can significantly underestimate the amount of risk associated with a 
given level of maintenance resources. As a result, maintenance logistics forecasts that do not take into account 
epistemic uncertainty may significantly underestimate the amount of logistics required to provide sufficient risk 
coverage to a high degree of confidence.’ 
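
The asymmetry can be illustrated with a small numerical sketch. It assumes a single item whose number of failures follows a Poisson distribution; the expected failure count of 2 and the sparing level of 5 are hypothetical values chosen for illustration and are not taken from the case study.

```python
import math

def poisson_cdf(spares: int, expected_failures: float) -> float:
    """P(failures <= spares) for a Poisson-distributed failure count."""
    term = total = math.exp(-expected_failures)  # k = 0 term
    for k in range(1, spares + 1):
        term *= expected_failures / k
        total += term
    return total

SPARES = 5      # hypothetical sparing level
BASELINE = 2.0  # hypothetical expected number of failures over the mission

risk_base = 1.0 - poisson_cdf(SPARES, BASELINE)
risk_worse = 1.0 - poisson_cdf(SPARES, 1.5 * BASELINE)   # failure rate 50% higher
risk_better = 1.0 - poisson_cdf(SPARES, 0.5 * BASELINE)  # failure rate 50% lower

penalty = risk_worse - risk_base   # risk added by lower-than-expected reliability
benefit = risk_base - risk_better  # risk removed by higher-than-expected reliability
print(f"penalty = {penalty:.4f}, benefit = {benefit:.4f}")
```

Under these assumed values, the risk added by a 50% higher failure rate is roughly four times the risk removed by a 50% lower one, which is the asymmetry described above.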

This paper examines the impact of both aleatory and epistemic uncertainty on maintenance logistics forecasting 
for crewed spacecraft. Both types of uncertainty are defined and their impacts on supportability are described. Models 
for each type of uncertainty are discussed, and a case study is examined with sensitivity analyses to isolate the impacts 
of variations in reliability estimates and uncertainty in those estimates. These results and their implications for human 
spaceflight mission development are then discussed. The remainder of this paper is structured as follows: Section II 
presents background regarding the role of uncertainty in supportability analysis, defining two types of uncertainty 
(aleatory and epistemic) and discussing the impacts that each has on maintenance logistics forecasting. Section III 
presents the modeling methodology used in this paper to examine maintenance logistics requirements as a function of 
both aleatory and epistemic uncertainty for a given mission, and Section IV applies this methodology to examine 
Environmental Control and Life Support (ECLS) maintenance logistics requirements for a notional Mars mission, 
along with sensitivity analyses to characterize the impacts of variations in both types of uncertainty independently. 
Section V discusses the results of this case study in the context of human spaceflight technology development, system 
design, and mission planning, and Section VI presents conclusions. 


II. Background: Uncertainty and Supportability 


Uncertainty is a major driver of maintenance logistics demand, which is in turn a significant driver of overall 
logistics demand for long-endurance missions.”>? Maintenance supplies, typically in the form of spare parts, must be 
provided in sufficient amounts to mitigate risk. On missions without timely abort or resupply options, the provision 
of sufficient spare parts is particularly important for critical systems such as ECLS, since a failure in these systems 
that goes unrepaired due to lack of resources could lead to Loss of Crew (LoC) when contingency options are limited. 
The probability that sufficient spare parts are provided for a given mission is captured by the Probability of Sufficiency 
(POS).!”!3 The overall system POS is the product of the POS for each item within the system. When abort and resupply 
are not available in a timely manner, POS for critical systems provides a bound on the Probability of Loss of Crew 
(P(LoC)):'4 


$$P(\mathrm{LoC}) \geq 1 - \mathrm{POS} \qquad (1)$$

P(LoC) is a key risk metric used for human spaceflight mission planning,!° and therefore equation 1 links the 
supportability-level metric POS to mission-level risk objectives. POS can be considered the “success probability” for 
maintenance logistics planning; the corresponding failure probability, 1 − POS, is referred to as the maintenance 
logistics contribution to P(LoC). 
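
As a brief sketch of how item-level POS values roll up to the bound in equation 1, the following computes the system POS as a product and the resulting lower bound on P(LoC). The function names and the item-level values in the usage lines are hypothetical.

```python
import math

def system_pos(item_pos_values):
    """System POS as the product of the item-level POS values."""
    return math.prod(item_pos_values)

def ploc_lower_bound(item_pos_values):
    """Lower bound on P(LoC) from equation 1: P(LoC) >= 1 - POS."""
    return 1.0 - system_pos(item_pos_values)

# Hypothetical item-level POS values for a two-item system.
example = [0.999, 0.998]
print(system_pos(example), ploc_lower_bound(example))
```

Because every factor lies between 0 and 1, the system POS can never exceed the lowest item-level POS.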

Two key types of uncertainty are present in maintenance logistics forecasting: !° 
• Aleatory uncertainty, which is the result of natural randomness inherent to the process being examined. 
  Uncertainty in the result of a coin flip using a coin of a known bias, for example, is aleatory. 


International Conference on Environmental Systems 


• Epistemic uncertainty, which results from a lack of knowledge about the process. Continuing the 
  previous example, if the bias of a given coin is unknown, uncertainty in that value is epistemic. 

Aleatory uncertainty is commonly accounted for in supportability analyses in the form of a failure rate (λ) for each 
particular item, sometimes captured as a Mean Time Between Failures (MTBF). Failure rate is the inverse of MTBF, 
but either number serves the same purpose: to parametrize a probability distribution characterizing the random variable 
representing the amount of time that an item will operate before failure. An exponential distribution is typically 
assumed for first-order analyses of random failures,!> and as a result the POS for a specific item, given its failure rate, 
is characterized by the Cumulative Distribution Function (CDF) of a Poisson distribution. '? For more sophisticated 
analyses, such as those enabled by the Exploration Maintainability Analysis Tool (EMAT),>”!® POS for individual 
items and/or the system as a whole can be characterized via Monte Carlo simulation of a given mission. 

However, the direct use of a Poisson distribution (or Monte Carlo analysis, or any other stochastic assessment tool) 
to examine POS, assuming a known failure rate, neglects the epistemic uncertainty present in failure rate estimates. 
The inclusion of epistemic uncertainty introduces the need for an additional characterization of the analysis. 
Confidence (C) is a measure of the fidelity of a given estimate under a given amount of epistemic uncertainty; in this 
context, confidence is defined as the probability that the actual POS is greater than or equal to the estimated one." 


C= P(POSactuat 2 POSestimated) (2) 


Put another way, confidence is the probability that an analysis does not underestimate risk. Both POS and confidence 
are critical considerations for supportability analysis when epistemic uncertainty is present, and failure to include 
epistemic uncertainty in supportability analyses can result in significant underestimates of risk and/or logistics mass 
requirements.° 


III. Methodology 


Many different models, incorporating various levels of fidelity, can be used to examine maintenance logistics 
requirements. These range from simple heuristics that calculate maintenance logistics mass as a percentage of system 
dry mass per year of operation!”!® to sophisticated tools such as EMAT that use Monte Carlo simulations of entire 
missions to consider interactions between items and time-dependent effects.*'® In addition, since many different 
combinations of spares could achieve risk objectives, there is a need to apply an optimization algorithm to find (or 
approximate) the manifest that does so for minimum mass. Increased model fidelity and manifest optimality require 
increased computational resources; therefore, there is a need to balance the desired fidelity of the analysis output with 
the amount of resources to be allocated for it.!? This analysis uses a mid-fidelity Poisson failure model, using Monte 
Carlo simulations to capture epistemic uncertainty, and applies a marginal analysis algorithm to approximate the 
optimal manifest that achieves risk objectives for minimum mass. Marginal analysis (described in greater detail in 
Section III.B below) is an optimization technique commonly used for spares manifests whereby the marginal value of 
each potential addition to the manifest is calculated by dividing the benefits of that item (i.e. reduction in risk) by its 
cost (i.e. mass). The item with the highest marginal value is added to the manifest, and the process is repeated until a 
desired risk objective is met.'? This approach does not include all considerations that may affect the supportability 
analysis, and the optimization approach does not necessarily find the true optimum, but it provides a good tradeoff 
between model fidelity and resource requirements for early-stage analyses such as this. 


A. Model 

The analysis conducted in this paper assumes that failures occur at a constant rate, and therefore a Poisson 
distribution is used to model the distribution of the number of failures that may occur for a given item. POS for a 
specific item, given a failure rate λ, is then given by equation 3, where n is the number of spares provided for that 
item and T is the mission endurance. 


$$\mathrm{POS} = \sum_{k=0}^{n} \frac{e^{-\lambda T} (\lambda T)^{k}}{k!} \qquad (3)$$

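
As a minimal sketch of equation 3, the item-level POS can be evaluated by summing the Poisson probabilities of 0 through n failures. The function name and the numerical values in the usage lines are illustrative assumptions and are not drawn from the case study data.

```python
import math

def poisson_pos(failure_rate: float, endurance: float, spares: int) -> float:
    """POS for one item: P(number of failures <= spares) under a Poisson model."""
    expected_failures = failure_rate * endurance  # lambda * T
    term = math.exp(-expected_failures)           # k = 0 term of the sum
    total = term
    for k in range(1, spares + 1):
        term *= expected_failures / k             # builds (lambda*T)^k / k! incrementally
        total += term
    return total

# Hypothetical item: 0.002 failures per day, 1,000-day endurance, 5 spares.
print(poisson_pos(0.002, 1000.0, 5))
```

Building each term from the previous one avoids computing large factorials and powers separately.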
A Monte Carlo approach is used to capture epistemic uncertainty, sampling failure rates from the failure rate 
uncertainty distribution for λ, which is assumed to be lognormal with Probability Density Function (PDF) given by 
equation 4.!° 


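$$\mathrm{PDF}(\lambda) = \frac{1}{\lambda \sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{\ln \lambda - \mu}{\sigma} \right)^{2}} \qquad (4)$$
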
Here μ and σ are the location and scale parameters of the distribution, respectively. Lognormal distributions are used 
as failure rate prior distributions in the ISS MTBF Bayesian updating process, and are the recommended distribution 
for capturing uncertainty in parameters estimated from aggregate data sources such as component failure 
histories.!°7°?! Each sampled failure rate in the Monte Carlo analysis is used to calculate POS, given that failure rate. 
The resulting set of sampled POS values is then used to determine confidence by calculating the fraction of sampled 
POS values that are above the required POS threshold. 
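
The Monte Carlo confidence calculation described above can be sketched for a single item as follows. This is an illustration under stated assumptions rather than the implementation used for the paper: the function names, the fixed random seed, and the example values are hypothetical, and a system-level analysis would multiply the sampled POS values of all items before comparing against the threshold.

```python
import math
import random

def poisson_pos(failure_rate, endurance, spares):
    """Item POS: Poisson CDF evaluated at the number of spares."""
    expected = failure_rate * endurance
    term = total = math.exp(-expected)
    for k in range(1, spares + 1):
        term *= expected / k
        total += term
    return total

def confidence(mu, sigma, endurance, spares, pos_required,
               samples=25_000, seed=0):
    """Fraction of sampled failure rates for which POS meets the requirement."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    hits = 0
    for _ in range(samples):
        rate = rng.lognormvariate(mu, sigma)  # epistemic uncertainty in the rate
        if poisson_pos(rate, endurance, spares) >= pos_required:
            hits += 1
    return hits / samples

# Hypothetical item: median failure rate 0.002 per day, sigma = 0.5.
print(confidence(math.log(0.002), 0.5, 1000.0, 8, 0.98, samples=2000))
```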


B. Optimization 

The objective of spares inventory optimization is to determine the number of spares to be provided for each item 
in order to achieve risk objectives (i.e. provide a required POS at a required confidence level) while minimizing the 
total mass of the spares inventory. This optimization problem is defined mathematically as follows 


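$$\text{Minimize} \quad \sum_{i} m_i n_i \qquad (5)$$
$$\text{s.t.} \quad P\left( \prod_{i} \mathrm{POS}_i(n_i) \geq \mathrm{POS}_{req} \right) \geq C_{req} \qquad (6)$$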


where m_i is the mass of item i, n_i is the number of spares provided for item i, POS_i(n_i) is the POS for item i given a 
certain number of spares, POS_req is the required POS for the system, and C_req is the required confidence. In order to 
solve this optimization problem, marginal analysis is used.!* This optimization approach is similar to that previously 
used by Lange and Anderson,” Stromgren et al.,> and Owens and de Weck;!*”? however, in this case the value being 
increased in the marginal analysis algorithm is not POS itself, but rather confidence at an associated POS level. 

For each analysis, a manifest is initialized to a known lower bound for each spare in order to begin the marginal 
analysis algorithm at a partially completed manifest rather than having to build it from scratch, thus saving 
computational time. For each item, this lower bound is equal to either the number of spares required for scheduled 
maintenance (based on life limits) or the number required so that the item achieves POS/confidence requirements 
individually, whichever is larger. Since the overall system POS is the product of the POS for each individual item, all 
of which are between 0 and 1, any manifest for which individual line items do not achieve POS/confidence 
requirements is known to not achieve those requirements as a whole. Once the manifest is initialized at this lower 
bound, the initial confidence level is calculated using the methodology outlined in Section III.A. Then, for each item, the 
marginal value of an additional spare is determined by calculating the confidence increase associated with providing 
another spare for that item and dividing by the mass of the item. That is, the marginal value of an item is the confidence 
increase per unit mass that another spare for that item could provide. The spare that provides the greatest marginal 
value is added to the inventory, and the process is repeated until the confidence level is above the required threshold 
— that is, until the constraint defined in equation 6 is met. In some cases, during the early phases of marginal analysis, 
all options will have 0 value because no single item is sufficient to enable any of the Monte Carlo simulations to 
achieve the desired POS value. In order to overcome this issue, the algorithm implemented here includes a startup 
phase in which the marginal value for each item is calculated using the increase in median POS resulting from the 
addition of a spare for that item rather than using confidence. This startup phase is continued until confidence reaches 
a predefined startup threshold value, which for this paper was set at 0.01. Once the manifest confidence level rises 
above this value, the standard confidence-based marginal analysis approach is used for the remainder of the manifest 
optimization. 
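
The confidence-based marginal analysis loop described above can be sketched as follows. This is a simplified illustration and not the code used for the paper: it starts from an empty manifest instead of the lower-bound initialization, it falls back to the median-POS ranking whenever no candidate raises confidence (an added assumption to avoid ties at zero), and the `Item` class, function names, and `max_steps` guard are all hypothetical.

```python
import math
import random
import statistics
from dataclasses import dataclass

@dataclass
class Item:
    mass: float   # mass of one spare (kg)
    mu: float     # lognormal location parameter of the failure rate
    sigma: float  # lognormal scale parameter of the failure rate

def poisson_pos(rate, endurance, spares):
    """Equation 3: Poisson CDF evaluated at the number of spares."""
    expected = rate * endurance
    term = total = math.exp(-expected)
    for k in range(1, spares + 1):
        term *= expected / k
        total += term
    return total

def marginal_analysis(items, endurance, pos_req, conf_req,
                      samples=2000, startup=0.01, seed=0, max_steps=10_000):
    rng = random.Random(seed)
    # Draw failure rates once so every candidate manifest sees the same samples.
    rates = [[rng.lognormvariate(it.mu, it.sigma) for _ in range(samples)]
             for it in items]
    spares = [0] * len(items)  # simplification: no lower-bound initialization
    table = [[poisson_pos(r, endurance, 0) for r in row] for row in rates]

    def evaluate(pos_table):
        system = [math.prod(column) for column in zip(*pos_table)]
        conf = sum(p >= pos_req for p in system) / samples
        return conf, statistics.median(system)

    conf, median = evaluate(table)
    mass, steps = 0.0, 0
    curve = [(mass, conf)]  # points on the mass-confidence curve
    while conf < conf_req:
        if steps == max_steps:
            raise RuntimeError("confidence requirement not reached")
        steps += 1
        candidates = []
        for i in range(len(items)):
            row = [poisson_pos(r, endurance, spares[i] + 1) for r in rates[i]]
            new_conf, new_median = evaluate(table[:i] + [row] + table[i + 1:])
            candidates.append((i, row, new_conf, new_median))
        # Startup phase (or no confidence gain anywhere): rank by median POS gain.
        by_conf = conf >= startup and any(c[2] > conf for c in candidates)

        def value(c):
            gain = (c[2] - conf) if by_conf else (c[3] - median)
            return gain / items[c[0]].mass  # benefit per unit mass

        i, row, conf, median = max(candidates, key=value)
        spares[i] += 1
        table[i] = row
        mass += items[i].mass
        curve.append((mass, conf))
    return spares, curve
```

Sampling the failure rates once and reusing them for every candidate keeps the comparison between candidate spares consistent, since differences in confidence then reflect only the added spare and not sampling noise.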

The end result of marginal analysis is a sequence of points forming a mass-confidence curve. Each discrete point 
along this curve represents a specific manifest along the mass-confidence Pareto frontier. As discussed by Owens and 
de Weck,” these points are Pareto-optimal, but they do not necessarily represent the minimum-mass manifest that 
achieves a given risk requirement. Other, nondominated solutions that achieve the desired POS and confidence for 
lower mass may exist. An algorithm to find the truly optimal solution is described by Owens and de Weck,” but the 
increased computational cost of the Monte Carlo portion of this analysis precludes its use for this paper in its current 
form. However, the piecewise-linear curve defined by the Pareto-optimal points found via marginal analysis does 
place a lower bound on the amount of mass that is required to achieve a given risk objective. Interpolation between 
adjacent points is used to find the mass value at which the Pareto frontier crosses the confidence threshold, which 
provides a slightly optimistic approximation of maintenance logistics mass. While this value slightly underestimates 
the amount of mass that may actually be required, in practice it tends to provide a better approximation of total mass 
than the first Pareto-optimal point above the threshold found via marginal analysis. 
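
The interpolation step can be sketched as a linear interpolation between the two adjacent curve points that bracket the confidence threshold. The function name and the curve values below are hypothetical and serve only to illustrate the calculation.

```python
def mass_at_confidence(curve, conf_req):
    """Interpolate the mass at which a mass-confidence curve crosses conf_req.

    curve is a list of (mass, confidence) points sorted by increasing mass.
    """
    if not curve:
        raise ValueError("curve is empty")
    if curve[0][1] >= conf_req:
        return curve[0][0]  # the first manifest already meets the requirement
    for (m0, c0), (m1, c1) in zip(curve, curve[1:]):
        if c0 < conf_req <= c1:
            return m0 + (conf_req - c0) / (c1 - c0) * (m1 - m0)
    raise ValueError("curve never reaches the required confidence")

# Hypothetical curve: (mass in kg, confidence).
curve = [(100.0, 0.2), (150.0, 0.6), (180.0, 0.95)]
print(mass_at_confidence(curve, 0.9))
```

For this hypothetical curve the interpolated mass lies between 150 kg and 180 kg, below the first manifest that actually exceeds the threshold, which is why the interpolated value is described as slightly optimistic.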
C. Data Sources 

The ISS Maintenance and Analysis Data Set (MADS) captures the current state of knowledge of failure rates 
(based on initial estimates and Bayesian updates on those estimates using failure histories) for repairable items on 
board the station in terms of a current Bayesian mean MTBF estimate and an Error Factor (EF). EF captures the level 
of uncertainty in the estimate, and is defined as the ratio between the 95th and 50th percentile (or, equivalently, the 50th 
and 5th percentile) values of the distribution. Therefore, μ and σ for the lognormal distribution defined in equation 4 
can be calculated based on the mean failure rate λ_mean and the EF using equations 7 and 8.!° 


$$\sigma = \frac{\ln(\mathrm{EF})}{1.645} \qquad (7)$$
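
The conversion from a mean failure rate and an EF to lognormal parameters can be sketched as follows. The expression for σ follows directly from the definition of EF as the ratio of the 95th to the 50th percentile. The expression for μ uses the standard lognormal identity mean = exp(μ + σ²/2); that this matches the paper's equation 8 is an assumption, and the function name and example values are hypothetical.

```python
import math
from statistics import NormalDist

# 95th percentile of the standard normal distribution (approximately 1.645).
Z95 = NormalDist().inv_cdf(0.95)

def lognormal_params(mean_rate: float, error_factor: float):
    """Return (mu, sigma) of a lognormal with the given mean and error factor."""
    if mean_rate <= 0.0 or error_factor < 1.0:
        raise ValueError("requires mean_rate > 0 and error_factor >= 1")
    sigma = math.log(error_factor) / Z95          # from EF = exp(Z95 * sigma)
    mu = math.log(mean_rate) - sigma ** 2 / 2.0   # from mean = exp(mu + sigma^2 / 2)
    return mu, sigma

# Hypothetical item: mean failure rate of 1e-4 per hour with EF = 3.
mu, sigma = lognormal_params(1e-4, 3.0)
print(mu, sigma)
```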
IV. Case Study: Notional Mars Deep Space Habitat 


As an illustrative example, this paper examines the maintenance logistics mass required to achieve various POS 
and confidence requirements for a notional Mars mission. Mission endurance is assumed to be 1,100 days,* and 
supportability analysis is conducted for Deep Space Habitat (DSH) ECLS systems. The specific systems examined 
include: 

• Oxygen Generation Assembly (OGA) 
• Carbon Dioxide Removal Assembly (CDRA) 
• Common Cabin Air Assembly (CCAA) 
• Trace Contaminant Control System (TCCS) 
• Urine Processor Assembly (UPA) 
• Water Processor Assembly (WPA) 
System characteristics, including masses, failure rate distributions, Quantity Per Application (QPA), life limits, and 
k-factors are assumed to be ISS-like, and are based on MADS data wherever possible.” If a specific DSH item does 
not appear in MADS, or if no Bayesian update data are available to provide failure rate estimates or EFs, data for that 
item are estimated based upon similarity to other items. Maintenance logistics mass requirements are calculated for 
POS values between 0.9 and 0.99999 and confidence levels ranging from 0.1 to 0.9 in increments of 0.1. For each 
assessment, 25,000 Monte Carlo simulations were executed. For comparison, maintenance logistics mass requirements 
were also calculated assuming that failure rates were deterministically known (i.e. neglecting epistemic uncertainty), 
with values based on the Bayesian mean MTBF estimate from MADS. Finally, a sensitivity analysis was conducted 
to examine the impact of variations in failure rates and EFs. 

Figure 2. Example mass-confidence curve, for a baseline POS of 1 − 1/270. Confidence levels between 0.01 and 0.99 are 
captured. Each point represents a specific, Pareto-optimal manifest found via marginal analysis, and the line segments 
between them represent the lower bound on the mass required to achieve the corresponding level of confidence. 

Figure 1 shows the maintenance logistics mass required for the notional DSH ECLS systems examined here. Mass 
requirements are shown for specific confidence levels across a range of POS values. In addition, the mass requirement 
calculated when epistemic uncertainty is neglected is shown by the black dotted line. Confidence levels are indicated 
by solid lines, with their confidence values labeled in the upper right. For ease of reading, the line corresponding to a 
confidence of 0.5 is bolded. For comparison, a baseline POS of 1 − 1/270, or approximately 0.9963 (based upon the 
mean P(LoC) requirement of no greater than 1 in 270 for commercial crew transportation systems~4), is used to 
examine confidence as a function of spares mass, shown in Figure 2 for confidence levels between 0.01 and 0.99. 
Each dot represents a specific, Pareto-optimal manifest, and the line segments between them represent a lower bound 
on the mass required to achieve the corresponding level of confidence, as described in Section III.B above. The vertical 
dashed line in Figure 1 corresponds to this baseline POS level. 

In addition, two sensitivity analyses are conducted to examine the impact of variations in failure rates and 
uncertainty, assuming the baseline POS described above. Failure rates for all items are varied between 0.1 and 1.2 
times their current values in order to examine the impact of increases and decreases in reliability. Since EF is the ratio 
between the 95th and 50th percentiles of the failure rate uncertainty distribution, it must be greater than or equal to 1 in 
order to be valid. Therefore, sensitivity to changes in EF cannot be examined through a direct multiplier, as was done 
for failure rates. Instead, the distance between the current EF and 1 was varied according to a multiplier, a, thus 
varying the amount of uncertainty in the failure rate estimate. A multiplier of 1 corresponds to the baseline case (no 
change), while a multiplier of 0 would correspond to a case with no uncertainty. The equations giving the 
sensitivity-analysis values of failure rates and EFs as a function of this multiplier are: 


$$\lambda' = a \lambda \qquad (9)$$
$$\mathrm{EF}' = 1 + a (\mathrm{EF} - 1) \qquad (10)$$




[Figure 3: two panels, "Sensitivity to Failure Rate" (left) and error factor (right); horizontal axis: Multiplier]

Figure 3. Sensitivity analysis results, showing maintenance logistics mass requirements when the failure rate 
(left) or error factor (right) for all items in the system are varied according to a multiplier a, as described by 
equations 9 and 10. 


Here λ and EF are the original values for the failure rate and EF for a given item, respectively, and λ' and EF' are 
their values after application of the multiplier a. The results of both of these sensitivity analyses are shown in Figure 3. 
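
A short sketch of the two sensitivity transformations follows. It assumes, as the text describes, that the distance between EF and 1 is what the multiplier scales, so that a = 1 reproduces the baseline and a = 0 removes all uncertainty. The function names and example values are hypothetical.

```python
def scaled_failure_rate(rate: float, a: float) -> float:
    """Scale the failure rate directly by the multiplier a."""
    return a * rate

def scaled_error_factor(error_factor: float, a: float) -> float:
    """Scale the distance between EF and 1 by a; the result stays >= 1 for a >= 0."""
    if error_factor < 1.0 or a < 0.0:
        raise ValueError("requires error_factor >= 1 and a >= 0")
    return 1.0 + a * (error_factor - 1.0)

# Hypothetical item with EF = 3: halving the uncertainty gives EF' = 2.
print(scaled_error_factor(3.0, 0.5))
```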


V. Discussion 


A. POS and Confidence Requirement Impacts on Mass 

Figure 1 shows that both POS and confidence requirements are strong drivers of maintenance logistics mass 
demands. Increased POS results in exponential growth in mass requirements; similarly, mass grows exponentially 
with higher confidence requirements. As an example, at a POS of 0.99, an increase in confidence from 0.7 to 0.8 
requires the addition of 547 kg of spare parts. A further increase in confidence from 0.8 to 0.9, in contrast, requires 
921 kg, nearly 1.7 times as much mass. At higher POS levels, the extra mass required to increase confidence also 
increases. An increase in confidence from 0.7 to 0.8 at a POS of 0.99999 requires an additional 782 kg, and an increase 
from 0.8 to 0.9 requires an additional 1,264 kg. Both POS and confidence are required in order to ensure that the 
system is maintained over the course of the mission, and both correspond to higher logistics mass requirements. 
Mission designers must carefully balance the two. 

Importantly, this analysis reaffirms the results of Stromgren et al.? in showing that analyses that neglect epistemic 
uncertainty seriously underestimate the amount of risk associated with a given level of logistics, or the mass required 
to drive risk down to acceptable levels. As Figure 1 shows, the dotted line indicating the mass requirement estimated 
using deterministic failure rates is significantly below the 0.1 confidence level. A supportability analysis that assumes 
that there is no uncertainty in failure rates may result in a mass allocation that is not even sufficient to provide a 
confidence of 1 in 10 that risk objectives are met. As POS requirements grow, this gap between the deterministic result 
and the result that includes epistemic uncertainty also grows. 


B. Sensitivity to Failure Rate and Epistemic Uncertainty 

Figure 3 shows that reduction in either failure rates or EFs can reduce the mass required to achieve a given 
confidence level. In both cases, the sensitivity is linearly related to the multiplier a, using the relationships defined in 
equations 9 and 10. Higher confidence levels are more sensitive to variation, and as a result the distance between 
confidence contours is reduced as either failure rate or EF is reduced. 

An important effect is observed here regarding the relationship between the mean failure rate value and the impact 
of uncertainty in that value. This sensitivity analysis examined variations in failure rates and EF values independently. 
However, the left side of Figure 3 shows that as failure rate values are decreased, the distance between confidence 
contours also decreases, indicating a reduction in overall uncertainty. This is a result of the way that uncertainty is 
captured. Since EF defines the ratio between the 95th and 50th percentile values of the lognormal distribution 
representing failure rate uncertainty, and the parameters of that distribution (which influence the location of the 
median) are a function of the mean failure rate estimate (see equations 7 and 8), a reduction in mean failure rate 
means that the scale of uncertainty is lower at the same EF. 
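
This scaling effect can be checked numerically with a small sketch. It assumes the standard lognormal relationships (median = exp(μ), mean = exp(μ + σ²/2), and EF equal to the ratio of the 95th to the 50th percentile); the function name and the failure rate values are hypothetical.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.95)  # 95th percentile of the standard normal

def median_and_p95(mean_rate: float, error_factor: float):
    """Return (median, 95th percentile) of a lognormal with the given mean and EF."""
    sigma = math.log(error_factor) / Z95
    mu = math.log(mean_rate) - sigma ** 2 / 2.0
    median = math.exp(mu)
    return median, median * error_factor  # EF is the ratio p95 / median

# Same EF, with the mean failure rate halved (hypothetical values).
m1, p1 = median_and_p95(1e-3, 3.0)
m2, p2 = median_and_p95(5e-4, 3.0)
print((p2 - m2) / (p1 - m1))
```

At a fixed EF, the absolute gap between the median and the 95th percentile is proportional to the mean failure rate, so halving the mean halves the gap.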
The benefits of reductions in uncertainty are limited by the amount of spares required in the deterministic case, as 
shown on the right side of Figure 3. Put another way, reducing uncertainty does not directly improve the system, but 
rather our understanding of it. The system will experience some unknown number of failures over the course of the 
mission, and will therefore need a corresponding set of spares in order to mitigate risk. Reducing uncertainty does not 
change the number of failures that will actually occur; it simply increases the analyst’s ability to forecast that number 
and lowers the number of spares required to provide the desired confidence. In order to further reduce mass 
requirements below the amount required in the deterministic case, failure rates must also be reduced. 

However, improvement in failure rates requires redesign and remanufacture of systems, while uncertainty 
reduction requires only observation. Thus, uncertainty reduction may be less expensive than failure rate reduction. In 
addition, since failure rate reduction requires changes to the system, it may inadvertently introduce new failure modes 
and new uncertainties. Significant design changes may also invalidate previous experience on a system, effectively 
“resetting” uncertainty levels to higher values and incurring the corresponding logistics mass increase. Since 
uncertainty reduction can be purely observational, it does not have the risk of introducing new failure modes, and does 
not have the impact of invalidating previous experience. Reduction in both failure rates and uncertainties may be 
necessary to reduce logistics mass to acceptable levels, and both could be accomplished concurrently, since both 
require operational experience. However, they have different costs, risks, and benefits, and should be weighed 
appropriately by system designers during development and testing. 


C. Implications for System Development 

These analyses demonstrate the significant impact that epistemic uncertainty has on supportability for deep space 
missions. If the uncertainty in failure rate estimates is not taken into account during mission planning, mission risk 
and/or logistics requirements will be severely underestimated. However, an understanding of epistemic uncertainty 
and its impacts allows for more realistic planning, as well as identification of potential mitigation strategies. The most 
straightforward approach is to gain operational experience on systems in a relevant environment for statistically 
relevant periods of time. Depending on the level of acceptable risk, this may mean that systems must be tested for 
time periods much longer than their nominal mission. Therefore, system development activities must take into account 
the need to deliver completed systems, ready for testing, well before they will be applied in a mission-critical setting. 
Mission and campaign timelines must include time to operate and gain experience, or accept much higher risk and/or 
logistics mass requirements and operate with higher levels of uncertainty. 

In addition, the value of design and operational heritage should not be underestimated. Evolutionary systems that 
build upon previous experience will likely have significantly lower uncertainty than completely new systems. 
However, the relevance of operational heritage should not be overestimated either. Designers must take care to 
understand the implications of changes between systems that have operated previously and the current design, and 
update uncertainty estimates accordingly. With careful planning and limited design changes, the use of a heritage 
system may result in logistics requirements lower than those for a new system, even if the new system has improved 
nominal performance, due to reduced uncertainty. System design and architecture decisions must balance the impact 
of increased uncertainty against the proposed performance improvements. 

Finally, new technology may provide opportunities for new approaches to logistics. For example, In-Space 
Manufacturing (ISM) can enable an adaptable approach to spares logistics by manufacturing spare parts on-demand 
when they are needed from common raw materials. This on-demand manufacturing capability reduces logistics 
requirements directly by enabling the benefits of commonality between items of different designs, as long as they are 
constructed from the same material. ISM also has the potential to mitigate the impact of uncertainty on logistics 
requirements, since it allows rebalancing of logistics resources among different types of spares during the mission. 
During the mission, items that exhibit higher-than-expected failure rates require more raw materials, but items that 
exhibit lower-than-expected failure rates require less. Without ISM, mission planners would have to pre-specify which 
spares are provided, and how many of each type; unused mass associated with a particular type of spare is wasted. 
When an ISM capability is available, a supply of raw materials is provided that can be specialized into different types 
of spare parts as needed, providing additional flexibility and reducing the total amount of spares logistics mass required 
to account for uncertainty. However, new technology introduces significant new uncertainties. The impacts of the 
new uncertainty and risk introduced by a new technology must be carefully balanced against its potential benefit, and 
the amount of test time required to reduce those uncertainties to acceptable levels must be factored into technology 
development and mission planning timelines. 
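
The rebalancing argument can be illustrated with a minimal sketch under strong simplifying assumptions that are not taken from the analysis in this paper: failure rates are known exactly, every spare has the same mass, and an idealized ISM system turns one unit of feedstock into one spare of any type with no overhead or manufacturing failures. Dedicated spares must cover the demand for each item separately, whereas pooled feedstock only has to cover the combined demand, which is itself Poisson distributed because the sum of independent Poisson variables is Poisson. The function names and the greedy allocation are illustrative only.

```python
import math


def poisson_cdf(n, expected_failures):
    """P(N <= n) for N ~ Poisson(expected_failures)."""
    term = math.exp(-expected_failures)
    total = term
    for k in range(1, n + 1):
        term *= expected_failures / k
        total += term
    return total


def dedicated_spares(expected_failures, pos_target):
    """Greedy allocation of pre-built spares (illustrative heuristic).

    Repeatedly adds one spare to the item whose POS improves by the largest
    factor, until the product of item POS values meets the target.
    Returns the number of spares allocated to each item.
    """
    spares = [0] * len(expected_failures)
    pos = [poisson_cdf(0, lam) for lam in expected_failures]
    while math.prod(pos) < pos_target:
        gains = [poisson_cdf(n + 1, lam) / p
                 for n, lam, p in zip(spares, expected_failures, pos)]
        best = gains.index(max(gains))
        spares[best] += 1
        pos[best] = poisson_cdf(spares[best], expected_failures[best])
    return spares


def pooled_units(expected_failures, pos_target):
    """Feedstock units needed when any unit can become any type of spare."""
    total_expected = sum(expected_failures)
    units = 0
    while poisson_cdf(units, total_expected) < pos_target:
        units += 1
    return units


# Hypothetical expected failure counts for two item types over one mission.
expected = [1.5, 0.8]
dedicated = dedicated_spares(expected, pos_target=0.99)
pooled = pooled_units(expected, pos_target=0.99)
print(sum(dedicated), pooled)
```

Any dedicated allocation that covers each item individually also covers the combined demand, so under these assumptions the pooled requirement can never exceed the dedicated total. A real ISM system would add equipment mass, feedstock losses, and its own failure modes, none of which are represented here.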


D. Assumptions, Limitations, and Future Work 
This analysis assumed that failure rates are constant over time, that repairs are implemented immediately when 
a failure occurs, and that repairs require negligible time to implement. In addition, no distinction is made with regard 
to when a failure occurs during a mission; the mission itself is not simulated on a day-by-day basis. These simplifying 
assumptions enable the use of Poisson distributions to model spares demand, significantly reducing modeling 
complexity, but they also reduce the fidelity of model results. However, for the purposes of this analysis, which is 
focused on understanding trends and sensitivities in maintenance logistics rather than on calculating exact cargo mass 
values, this level of fidelity is considered sufficient. 
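
Under these assumptions the number of failures of an item over the mission is Poisson distributed, with an expected value equal to the failure rate multiplied by the mission endurance T, so the probability that n spares are sufficient is the Poisson cumulative distribution evaluated at n. The sketch below shows that calculation. Treating manifest POS as the product of independent item POS values is an assumption made for illustration, as are the function names and the example numbers.

```python
import math


def poisson_cdf(n, expected_failures):
    """P(N <= n) for N ~ Poisson(expected_failures)."""
    term = math.exp(-expected_failures)
    total = term
    for k in range(1, n + 1):
        term *= expected_failures / k
        total += term
    return total


def item_pos(failure_rate, endurance, n_spares):
    """Probability that n_spares cover all failures of one item.

    failure_rate and endurance must use consistent time units
    (for example failures per hour and hours).
    """
    return poisson_cdf(n_spares, failure_rate * endurance)


def manifest_pos(manifest, endurance):
    """POS for a whole manifest, assuming independent items.

    manifest is a list of (failure_rate, n_spares) pairs.
    """
    pos = 1.0
    for failure_rate, n_spares in manifest:
        pos *= item_pos(failure_rate, endurance, n_spares)
    return pos


# Hypothetical two-item manifest over a 1000-hour endurance.
example = [(1.0e-3, 3), (2.0e-4, 1)]
print(manifest_pos(example, endurance=1000.0))
```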

The current modeling approach relies on a Monte Carlo analysis to examine epistemic uncertainty, calculating 
POS for each manifest tens of thousands of times in order to determine confidence. While the assumption that demand 
for each item follows a Poisson distribution allows for rapid calculation of POS, this Monte Carlo approach is still 
time-consuming, especially given that manifest POS must be re-calculated repeatedly as spares are added in the 
marginal analysis algorithm. Future work will seek a closed-form solution (or approximation) to the distribution of 
POS given uncertainty distributions for failure rates in order to accelerate the manifest optimization process. 
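
The Monte Carlo step can be sketched as follows. This is a simplified illustration, not the implementation used for the results in this paper. It assumes that each failure rate is lognormally distributed, that EF is the ratio of the 95th percentile to the median (a common convention, which may differ in detail from equations 7 and 8), and that manifest POS is the product of independent item POS values. Confidence is estimated as the fraction of sampled failure-rate sets for which the manifest meets the POS target. All names and example values are hypothetical.

```python
import math
import random

Z95 = 1.645  # approximate 95th percentile of the standard normal


def lognormal_params(mean_rate, error_factor):
    """Lognormal (mu, sigma) from a mean and an error factor (assumed convention)."""
    sigma = math.log(error_factor) / Z95
    mu = math.log(mean_rate) - 0.5 * sigma ** 2
    return mu, sigma


def poisson_cdf(n, expected_failures):
    """P(N <= n) for N ~ Poisson(expected_failures)."""
    term = math.exp(-expected_failures)
    total = term
    for k in range(1, n + 1):
        term *= expected_failures / k
        total += term
    return total


def confidence(manifest, endurance, pos_target, n_samples=20000, seed=0):
    """Fraction of sampled failure-rate sets for which POS >= pos_target.

    manifest is a list of (mean_rate, error_factor, n_spares) tuples.
    """
    rng = random.Random(seed)
    params = [lognormal_params(mean, ef) for mean, ef, _ in manifest]
    hits = 0
    for _ in range(n_samples):
        pos = 1.0
        for (mu, sigma), (_, _, n_spares) in zip(params, manifest):
            rate = rng.lognormvariate(mu, sigma)  # one epistemic sample
            pos *= poisson_cdf(n_spares, rate * endurance)
        if pos >= pos_target:
            hits += 1
    return hits / n_samples


# Hypothetical manifest: (mean failure rate per hour, EF, spares carried).
example = [(1.0e-4, 3.0, 4), (5.0e-5, 5.0, 3)]
print(confidence(example, endurance=10000.0, pos_target=0.95, n_samples=5000))
```

Because every candidate manifest requires a full pass over the samples, the cost grows with both the sample count and the number of marginal-analysis iterations, which is the motivation for seeking a closed-form alternative.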

The sensitivity analysis in this paper varied parameters (mean failure rate or EF) for all items together by applying 
a multiplier, as described in equations 9 and 10. In application, it is likely that there are specific items for which 
focused effort to reduce failure rates and/or failure rate uncertainty would produce the greatest benefit. Future work 
will seek to identify these high-leverage items through more in-depth sensitivity analyses that examine the impacts of 
variations in failure rate and uncertainty for specific items or sets of items. 
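
A common-multiplier sweep of this kind can be sketched as below. The exact form of equations 9 and 10 is not reproduced; this illustration simply scales every mean failure rate by the same factor and recomputes spares mass in the deterministic case, giving each item the smallest number of spares that meets a per-item POS target. The per-item target, the item data, and the function names are all assumptions made for the example.

```python
import math


def poisson_cdf(n, expected_failures):
    """P(N <= n) for N ~ Poisson(expected_failures)."""
    term = math.exp(-expected_failures)
    total = term
    for k in range(1, n + 1):
        term *= expected_failures / k
        total += term
    return total


def min_spares(expected_failures, pos_target):
    """Smallest number of spares whose Poisson CDF meets pos_target."""
    n = 0
    while poisson_cdf(n, expected_failures) < pos_target:
        n += 1
    return n


def spares_mass(items, endurance, item_pos_target, rate_multiplier=1.0):
    """Total spares mass with all mean failure rates scaled by rate_multiplier.

    items is a list of (mean_rate, unit_mass) pairs.
    """
    total = 0.0
    for mean_rate, unit_mass in items:
        expected = mean_rate * rate_multiplier * endurance
        total += min_spares(expected, item_pos_target) * unit_mass
    return total


# Hypothetical items: (mean failure rate per hour, mass per spare in kg).
items = [(1.0e-4, 12.0), (4.0e-5, 30.0), (2.0e-4, 3.5)]
for multiplier in (0.5, 1.0, 2.0):
    print(multiplier, spares_mass(items, 10000.0, 0.995, multiplier))
```

An item-level version of the same loop, scaling one failure rate at a time, is one possible way to rank items by leverage.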

In addition, this analysis examined variations in failure rate and uncertainty independently. However, a key 
underlying factor in both of these parameters is time. Systems that allocate more time for testing and operational 
experience will have the opportunity to lower uncertainty regarding failure rates by observing failures and using the 
resulting data to update estimates. Similarly, observation of failures allows for the correction of those failures, though 
corrective action may increase uncertainty in some areas by introducing changes which negate the applicability of past 
experience to failure rate estimation. Future analyses will investigate the joint effect of variations in failure rate 
and uncertainty. 
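
The link between operating time and uncertainty can be illustrated with a conjugate Bayesian update. The sketch below swaps in a gamma prior on the failure rate, which is conjugate to Poisson failure counts, in place of the lognormal distributions used in this analysis, purely because the update then has a closed form. It is not the ISS update process, and the prior parameters and observation record are hypothetical.

```python
import math


def gamma_update(alpha, beta, failures, exposure_time):
    """Posterior gamma parameters after observing failures over exposure_time.

    alpha is the shape parameter and beta is the rate parameter, in the same
    time units as exposure_time.
    """
    return alpha + failures, beta + exposure_time


def gamma_mean_sd(alpha, beta):
    """Mean and standard deviation of a gamma(alpha, beta) failure rate."""
    return alpha / beta, math.sqrt(alpha) / beta


# Hypothetical prior with mean failure rate 0.002 per hour.
prior = (2.0, 1000.0)
# Hypothetical test campaign: 4 failures in 2000 hours of operation.
posterior = gamma_update(*prior, failures=4, exposure_time=2000.0)

print(gamma_mean_sd(*prior))
print(gamma_mean_sd(*posterior))
```

In this example the observed failure frequency matches the prior mean, so the estimate itself does not move, but its standard deviation shrinks as operating time accumulates. A design change that invalidates earlier experience would correspond to discarding some or all of the accumulated exposure.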

Finally, this analysis did not include any assessment of ISM impacts. Future work will combine ISM models that 
have been described in previous work with this epistemic analysis capability to build a holistic maintenance logistics 
analysis tool. 


VI. Conclusion 


The systems that carry humans beyond LEO and out into the solar system will face supportability challenges unlike 
any that have been encountered in past spaceflight experience. They will need to operate independently of Earth, 
without access to timely resupply and without the option of timely abort home, for very long periods of time. A 
significant amount of maintenance resources will likely be required to reduce risk to acceptable levels. Uncertainty in 
parameter values (in this case, failure rates) adds another dimension to risk assessment beyond the pure probability of 
success or failure itself — namely, the confidence in that risk assessment. Analyses that neglect this uncertainty and 
assume that failure rate values are deterministically known will seriously underestimate the amount of risk associated 
with a given level of maintenance logistics, providing results that have very low confidence levels. This may impact 
mission planning by resulting in significant underestimates of the amount of maintenance logistics mass required to 
reduce risk to acceptable levels. Use of modeling techniques to incorporate epistemic uncertainty, such as those 
described in this paper, allows system designers to directly assess both POS and confidence and understand how both 
are impacted by variations in system parameters. 